The aim of this project is to classify traffic signs into the appropriate categories. The project uses the German Traffic Sign Dataset, which contains 43 distinct categories.
The classifier is implemented using a deep learning technique called a convolutional neural network (CNN). The implementation is divided into the following steps.
As stated earlier, the German Traffic Sign dataset is used. The traffic sign images and their labels are pickled in a dictionary with 4 key/value pairs.
# Import useful packages
from datetime import timedelta
from IPython.display import Image  # note: shadowed by PIL's Image import on the next line
from PIL import Image
from scipy.stats import itemfreq  # deprecated in newer SciPy; np.unique(..., return_counts=True) is the modern equivalent
from sklearn.model_selection import train_test_split  # sklearn.cross_validation in older scikit-learn versions
from sklearn.preprocessing import LabelBinarizer
from sklearn.utils import resample
from tqdm import tqdm
from zipfile import ZipFile
import collections
import cv2
import math
import matplotlib.gridspec as gridspec
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
import numpy as np
import os
import pandas as pd
import pickle
import tensorflow as tf
import time
import warnings
warnings.filterwarnings("ignore")
Let's first read the pickle files to load the training and test features and labels. X_train and X_test hold the features (sign images), while y_train and y_test hold the labels (sign classes).
# Load pickled data
import pickle
# TODO: fill this in based on where you saved the training and testing data
directory = "./traffic-signs-data"
training_file = directory + "/train.p"
testing_file = directory + "/test.p"
with open(training_file, mode='rb') as f:
    train = pickle.load(f)
with open(testing_file, mode='rb') as f:
    test = pickle.load(f)
X_train, y_train = train['features'], train['labels']
X_test, y_test = test['features'], test['labels']
Now, let's load the sign-name data dictionary and check what each category refers to.
# Read traffic sign data dictionary
ts_data_dictionary = pd.read_csv("./signnames.csv")
ts_data_dictionary
# number of training examples
n_train = len(X_train)
# number of testing examples
n_test = len(X_test)
# shape of an image
image_shape = (X_train[0].shape)
# Number of classes in the dataset
n_classes = len(np.unique(y_train))
# image size in pixels
imgsize = X_train[0].shape[1]
print("Number of training examples =", n_train)
print("Number of testing examples =", n_test)
print("Image data shape =", image_shape)
print("Number of classes =", n_classes)
# Bar chart of traffic sign labels
label_freq = itemfreq(y_train)
def plot_tf_freq(label):
    """Plot frequency of each traffic sign"""
    y_train = label
    label_freq = itemfreq(y_train)
    fig = plt.figure(figsize=(8,4))
    plt.bar(label_freq[:,0], label_freq[:,1], align='center', alpha=0.5)
    plt.xlabel("Traffic Sign Labels")
    plt.ylabel("Frequency")
    plt.title("Frequency of Traffic Sign Labels")
    plt.grid(True)
    axes = plt.gca()
    axes.set_xlim([-1,44])
    axes.set_ylim([0,2600])
    plt.gcf().set_size_inches(8,4)
    plt.show()
    # fig.savefig("TrafficSignal_Frequency.png")
plot_tf_freq(y_train)
# Statistics of the data
label_min = np.min(label_freq[:,1])
label_max = np.max(label_freq[:,1])
label_average = n_train/n_classes
# Calculate the number of under-represented label categories below a certain threshold
threshold = 500
num_underrepresented_labels = np.sum(label_freq[:,1] <= threshold)
print("Minimum sample size in a category =", label_min)
print("Maximum sample size in a category =", label_max)
print("Average samples in a category =", label_average)
print("# Under-represented categories below 500 threshold =", num_underrepresented_labels)
The statistics and plot indicate that the data is imbalanced across the traffic sign classes. To turn this into a balanced-class problem, additional data is generated in the next sub-section.
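One quick way to quantify such an imbalance is to compare the most and least frequent classes. This is a standalone sketch on a toy label array (the real per-class counts of y_train differ), using `np.bincount` to get per-class frequencies in one call:

```python
import numpy as np

# Toy label array standing in for y_train; the real per-class counts differ.
labels = np.array([0, 0, 0, 0, 1, 1, 2] * 10)

counts = np.bincount(labels)                   # per-class frequencies
imbalance_ratio = counts.max() / counts.min()  # most vs. least frequent class

print(counts)            # [40 20 10]
print(imbalance_ratio)   # 4.0
```

A ratio well above 1 means a classifier can score deceptively well by favoring the frequent classes, which is why the augmentation step below targets roughly equal counts per class.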
# Grid plot images
rows = n_classes
cols = 5
def gridplot_ts(ts_dict, X_train, filename):
    """Plot sample images from each category"""
    fig = plt.figure()
    gs1 = gridspec.GridSpec(rows, cols)
    gs1.update(wspace=0.01, hspace=0.02)  # set the spacing between axes
    scaling = 4
    plt.figure(figsize=(1*scaling,9*scaling))
    for label in np.arange(rows):
        for i in np.arange(cols):
            ax1 = plt.subplot(gs1[label*cols+i])
            ax1.set_xticklabels([])
            ax1.set_yticklabels([])
            ax1.set_aspect('equal')
            idx = np.random.choice(ts_dict[label], 1, replace=False)[0]
            plt.subplot(rows,cols,label*cols+i+1)
            plt.imshow(X_train[idx])
            plt.axis('off')
    fig.savefig(filename)
    plt.show()
# Build traffic sign dictionary index to find out
# which images belong to each sign category
ts_dict_index = {}
for numlabel in np.arange(n_classes):
    ts_dict_index[numlabel] = []
    for i in np.where(y_train==numlabel)[0]:
        ts_dict_index[numlabel].append(i)
gridplot_ts(ts_dict_index, X_train, "original_ts.png")
With the data loaded, it's time to preprocess it. Grayscaling and image normalization are performed, and to overcome the class imbalance, additional data is generated from the original training data.
In particular, grayscale is chosen over RGB because it averages the pixel information from all three color channels.
Image normalization is also an important step: it brings the pixel intensities of different images into the range [0.0, 1.0] with mean 0.5. This is particularly useful when applying the same algorithm to different images, and for numerical stability.
In the preprocessing step, images are converted to grayscale and then normalized.
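The min-max scaling can be sanity-checked on a few pixel values. This standalone sketch mirrors the normalization formula used in the helper functions below and confirms that 0 maps to 0.0, the mid-intensity 127.5 to 0.5, and 255 to 1.0:

```python
import numpy as np

def minmax_normalize(img, a=0.0, b=1.0, lo=0.0, hi=255.0):
    # Linear rescaling: a + (img - lo)*(b - a)/(hi - lo)
    return a + (img - lo) * (b - a) / (hi - lo)

pixels = np.array([0.0, 127.5, 255.0])
print(minmax_normalize(pixels))   # [0.  0.5 1. ]
```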
#%% Helper Functions
def grayscale(img):
    """Applies the Grayscale transform.
    This will return an image with only one color channel.
    To see the returned image as grayscale,
    call plt.imshow(gray, cmap='gray')"""
    return cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

def normalize_image(img):
    """Normalize the image to the range [0, 1]"""
    a = 0
    b = 1
    greyscale_min = 0
    greyscale_max = 255
    return a + (((img - greyscale_min)*(b - a))/(greyscale_max - greyscale_min))

def img_data_normalization(img_data):
    """Perform greyscaling and normalization of images"""
    img_data_gray_temp = []
    for i in np.arange(len(img_data)):
        img_data_gray_temp.append(grayscale(img_data[i]))
    img_data_gray = np.asarray(img_data_gray_temp)
    img_data_normalized = normalize_image(img_data_gray)
    del img_data_gray_temp, img_data_gray
    return img_data_normalized
# Grayscale and normalize train and test images
X_train_gray = img_data_normalization(X_train)
X_test_gray = img_data_normalization(X_test)
print('Images Preprocessed: Grayscaled and Normalized')
# Turn labels into numbers and apply One-Hot Encoding
encoder = LabelBinarizer()
encoder.fit(y_train)
train_labels = encoder.transform(y_train).astype(np.float32)
test_labels = encoder.transform(y_test).astype(np.float32)
print('Labels converted to one-hot encoded vector')
The additional data is generated from the training dataset by randomly picking images and transforming them with translation, rotation, and shear transformations.
# Helper function to transform, rotate and shear an image
# Adopted from https://carnd-udacity.atlassian.net/wiki/questions/10322627/project-2-unbalanced-data-generating-additional-data-by-jittering-the-original-image
def transform_image(img, ang_range, shear_range, trans_range):
    '''
    This function transforms images to generate new images.
    The function takes the following arguments:
    1- img: Image
    2- ang_range: Range of angles for rotation
    3- shear_range: Range of values for the shear (affine) transform
    4- trans_range: Range of values for translations
    A random uniform distribution is used to generate the transformation parameters.
    '''
    # Rotation
    ang_rot = np.random.uniform(ang_range) - ang_range/2
    rows, cols, ch = img.shape
    Rot_M = cv2.getRotationMatrix2D((cols/2, rows/2), ang_rot, 1)
    # Translation
    tr_x = trans_range*np.random.uniform() - trans_range/2
    tr_y = trans_range*np.random.uniform() - trans_range/2
    Trans_M = np.float32([[1, 0, tr_x], [0, 1, tr_y]])
    # Shear
    pts1 = np.float32([[5, 5], [20, 5], [5, 20]])
    pt1 = 5 + shear_range*np.random.uniform() - shear_range/2
    pt2 = 20 + shear_range*np.random.uniform() - shear_range/2
    pts2 = np.float32([[pt1, 5], [pt2, pt1], [5, pt2]])
    shear_M = cv2.getAffineTransform(pts1, pts2)
    img = cv2.warpAffine(img, Rot_M, (cols, rows))
    img = cv2.warpAffine(img, Trans_M, (cols, rows))
    img = cv2.warpAffine(img, shear_M, (cols, rows))
    return img
Now it is time to figure out how many additional samples to generate for each class label. From the label-frequency graph in the section above, class #2 (Speed limit (50km/h)) has the highest frequency, 2250. Instead of taking the maximum frequency as the target sample count for every class, the target count for each class is drawn from a normal distribution with mean equal to (2250 + delta), where delta is a number of choice. For this exercise, delta is chosen as 250.

# Count of new data to be generated in each category
from scipy.stats import norm
max_data_size = label_max+250
scaling_factor = 25
ts_max_labels = np.rint(norm.rvs(loc = max_data_size, size = n_classes, scale = scaling_factor))
new_data_labels_count = (ts_max_labels - label_freq[:,1])
new_data_mul_factor = (new_data_labels_count/label_freq[:,1])
new_counts = (label_freq[:,1]*new_data_mul_factor)
The frequencies of the original and generated data are plotted below.
# plot counts of original and new data
def plot_new_data(label, new_label):
    x = range(len(label))
    fig = plt.figure(figsize=(8,4))
    p1 = plt.bar(x, label, align='center', alpha=0.5)
    p2 = plt.bar(x, new_label, align='center', alpha=0.5, color='r',
                 bottom=label)
    plt.xlabel("Traffic Sign Labels")
    plt.ylabel("Frequency")
    plt.title("Frequency of Traffic Sign Labels")
    plt.grid(True)
    plt.legend((p1[0], p2[0]), ('Original Data', 'Generated Data'))
    axes = plt.gca()
    axes.set_xlim([-1,44])
    axes.set_ylim([0,3400])
    # plt.gcf().set_size_inches(8,4)
    plt.show()
plot_new_data(itemfreq(y_train)[:,1], new_counts.astype(int))
Now that we know how many additional samples to generate for each class, images are randomly picked and new data is generated by applying translation, rotation, and shear transformations to each one.
#%% Generate additional data
angle_range = 15
shear_range = 5
translation_range = 5
X_train_temp = X_train
y_train_temp = y_train
new_X_train = X_train_temp
new_y_train = y_train_temp
newts_dict_index = {}
inc_index = len(X_train_temp)
threshold = 0.2
for label in np.arange(rows):
    newts_dict_index[label] = []
    dummy_array = np.array([np.zeros(X_train_temp[0].shape)])
    dummy_y = np.array([np.zeros(y_train_temp[0].shape)])
    for i in np.arange(new_counts[label]):
        idx = np.random.choice(ts_dict_index[label], 1, replace=False)[0]
        img = X_train_temp[idx]
        newimg = transform_image(img, angle_range, shear_range, translation_range)
        dummy_array = np.concatenate((dummy_array, [newimg]))
        dummy_y = np.concatenate((dummy_y, [label]))
        # Record the index this image will occupy in new_X_train, then advance
        newts_dict_index[label].append(inc_index)
        inc_index = inc_index + 1
    dummy_array = np.delete(dummy_array, 0, 0)
    dummy_y = np.delete(dummy_y, 0, 0)
    new_X_train = np.concatenate((new_X_train, dummy_array))
    new_y_train = np.concatenate((new_y_train, dummy_y))
print(">>>>> New data generation completed.")
Randomly selected samples from the additional dataset are plotted below.
# Plot sample generated images
gridplot_ts(newts_dict_index, new_X_train, "new_ts.png")
# Convert images to float32, then grayscale and normalize them
new_X_train = np.array(new_X_train, dtype=np.float32)
new_y_train = np.array(new_y_train, dtype=np.float32)
new_X_train_gray = img_data_normalization(new_X_train)
# Get randomized datasets for training, validation and test
train_features, valid_features, train_labels, valid_labels = train_test_split(
    new_X_train_gray,
    new_y_train,
    test_size=0.1,
    random_state=832289)
test_features = X_test_gray
test_labels = y_test
print('Training features and labels randomized and split.')
# Plot frequencies of the training, validation and test datasets
freq_train = itemfreq(train_labels)[:,1]
freq_test = itemfreq(test_labels)[:,1]
freq_valid = itemfreq(valid_labels)[:,1]
def plot_freq_data(freq_train, freq_valid, freq_test):
    x = range(len(freq_train))
    fig = plt.figure(figsize=(8,4))
    p1 = plt.bar(x, freq_train, align='center', alpha=0.5)
    p2 = plt.bar(x, freq_valid, align='center', alpha=0.5, color='r',
                 bottom=freq_train)
    p3 = plt.bar(x, freq_test, align='center', alpha=0.5, color='g')
    plt.xlabel("Traffic Sign Labels")
    plt.ylabel("Frequency")
    plt.title("Frequency of Traffic Sign Labels")
    plt.grid(True)
    plt.legend((p1[0], p2[0], p3[0]), ('Training Data', 'Validation Data', 'Test Data'))
    axes = plt.gca()
    axes.set_xlim([-1,44])
    axes.set_ylim([0,3700])
    # plt.gcf().set_size_inches(8,4)
    plt.show()
plot_freq_data(freq_train, freq_valid, freq_test)
#%% Flattening and One-Hot encoding
imgsize = X_train[0].shape[1]
#Flattening images
train_features = np.reshape(train_features,(len(train_features),imgsize*imgsize)).astype(np.float32)
test_features = np.reshape(test_features,(len(test_features),imgsize*imgsize)).astype(np.float32)
valid_features = np.reshape(valid_features,(len(valid_features),imgsize*imgsize)).astype(np.float32)
encoder = LabelBinarizer()
encoder.fit(train_labels)
train_labels = encoder.transform(train_labels).astype(np.float32)
test_labels = encoder.transform(test_labels).astype(np.float32)
valid_labels = encoder.transform(valid_labels).astype(np.float32)
# Dump the generated data to a pickle file
pickle_file = './traffic-signs-data/traffic_signal.p'
if not os.path.isfile(pickle_file):
    print('Saving data to pickle file...')
    try:
        with open(pickle_file, 'wb') as pfile:
            pickle.dump(
                {
                    'train_dataset': train_features,
                    'train_labels': train_labels,
                    'valid_dataset': valid_features,
                    'valid_labels': valid_labels,
                    'test_dataset': test_features,
                    'test_labels': test_labels
                },
                pfile, pickle.HIGHEST_PROTOCOL)
    except Exception as e:
        print('Unable to save data to', pickle_file, ':', e)
        raise
    print('Data cached in pickle file.')
# Read the pickle file
pickle_file = './traffic-signs-data/traffic_signal.p'
with open(pickle_file, 'rb') as f:
    pickle_data = pickle.load(f)
    train_features = pickle_data['train_dataset']
    train_labels = pickle_data['train_labels']
    valid_features = pickle_data['valid_dataset']
    valid_labels = pickle_data['valid_labels']
    test_features = pickle_data['test_dataset']
    test_labels = pickle_data['test_labels']
    del pickle_data  # Free up memory
print('Data and modules loaded.')
Helper functions are written to plot image classification errors.
# Plot images
#%% Adopted from the Hvass lab and modified
# https://github.com/Hvass-Labs/TensorFlow-Tutorials/blob/master/02_Convolutional_Neural_Network.ipynb
def plot_images(images, cls_true, rows, cols, cls_pred=None):
    img_shape = (imgsize, imgsize)
    # Create figure with rows x cols sub-plots.
    fig, axes = plt.subplots(rows, cols)
    for i, ax in enumerate(axes.flat):
        # Plot image.
        ax.imshow(images[i].reshape(img_shape), cmap='binary')
        # Show true and predicted classes.
        if cls_pred is None:
            xlabel = "True: {0}".format(cls_true[i])
        else:
            xlabel = "True: {0}\n Pred: {1}".format(cls_true[i], cls_pred[i])
        # Show the classes as the label on the x-axis.
        ax.set_xlabel(xlabel)
        # Remove ticks from the plot.
        ax.set_xticks([])
        ax.set_yticks([])
    fig.tight_layout()
    plt.show()
#%% Plot example errors
#%% Adopted from the Hvass lab and modified
# https://github.com/Hvass-Labs/TensorFlow-Tutorials/blob/master/02_Convolutional_Neural_Network.ipynb
def plot_example_errors(images, cls_true, cls_pred, rows, cols):
    # cls_pred is an array of the predicted class-number for
    # all images in the test-set.
    rc = rows*cols
    # Boolean array: whether the predicted class equals
    # the true class for each image in the test-set.
    correct = (cls_true == cls_pred)
    # Negate the boolean array.
    incorrect = (correct == False)
    # Get the images from the test-set that have been
    # incorrectly classified.
    images = images[incorrect][0:rc]
    # Get the predicted classes for those images.
    cls_pred = cls_pred[incorrect][0:rc]
    # Get the true classes for those images.
    cls_true = cls_true[incorrect][0:rc]
    # Plot the first rows*cols misclassified images.
    plot_images(images, cls_true, rows, cols, cls_pred)
Since regular neural nets don't scale well to full images, a convolutional neural network (CNN) is used to extract features and then classify the traffic sign images. Because the features are non-linear, a single-layer CNN won't capture them, so a two-layer architecture, as shown in the figure, is used.
The details of the layers are as follows: two 5x5 convolutional layers (64 and 128 kernels, respectively), each followed by ReLU activation, 2x2 max pooling, and dropout; a fully connected layer with 1024 neurons; and a 43-way output layer.
An Adam optimizer is used to minimize the loss function, and dropout is used during training to avoid overfitting.
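As a sanity check on the layer sizes (taken from the weight definitions further below), the fully connected layer's 8*8*128 input size follows from two 2x2 max-pool halvings of the 32x32 input, with 'SAME' padding leaving the convolutions size-preserving:

```python
imgsize = 32            # input image width/height
layer2_kernels = 128    # kernels in the second conv layer

# Each 'SAME'-padded conv keeps the spatial size; each 2x2 max-pool halves it.
after_pool1 = imgsize // 2        # 32 -> 16
after_pool2 = after_pool1 // 2    # 16 -> 8

flatten_size = after_pool2 * after_pool2 * layer2_kernels
print(flatten_size)               # 8192, the input size of 'wd1'
```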

# Define Tensorflow placeholders
features = tf.placeholder(tf.float32)
labels = tf.placeholder(tf.float32)
keep_prob = tf.placeholder(tf.float32)
# Define feed dictionaries for train, validation, test data
train_feed_dict = {features: train_features, labels: train_labels, keep_prob: 1}
valid_feed_dict = {features: valid_features, labels: valid_labels, keep_prob: 1}
test_feed_dict = {features: test_features, labels: test_labels, keep_prob: 1}
Wrappers to define the CNN are implemented.
#%% Adapted from https://github.com/aymericdamien/TensorFlow-Examples/blob/master/examples/3_NeuralNetworks/convolutional_network.py
# Create some wrappers for simplicity
def conv2d(x, W, b, strides=1):
    # Conv2D wrapper, with bias and relu activation
    x = tf.nn.conv2d(x, W, strides=[1, strides, strides, 1], padding='SAME')
    x = tf.nn.bias_add(x, b)
    return tf.nn.relu(x)

def maxpool2d(x, k=2):
    # MaxPool2D wrapper
    return tf.nn.max_pool(x, ksize=[1, k, k, 1], strides=[1, k, k, 1],
                          padding='SAME')
# Create CNN model
def conv_net(x, weights, biases, dropout):
    # Reshape input picture
    x = tf.reshape(x, shape=[-1, imgsize, imgsize, 1])
    # Convolution Layer
    conv1 = conv2d(x, weights['wc1'], biases['bc1'])
    # Max Pooling (down-sampling)
    conv1 = maxpool2d(conv1, k=2)
    conv1 = tf.nn.dropout(conv1, dropout)
    # Convolution Layer
    conv2 = conv2d(conv1, weights['wc2'], biases['bc2'])
    # Max Pooling (down-sampling)
    conv2 = maxpool2d(conv2, k=2)
    conv2 = tf.nn.dropout(conv2, dropout)
    # Fully connected layer
    # Reshape conv2 output to fit fully connected layer input
    fc1 = tf.reshape(conv2, [-1, weights['wd1'].get_shape().as_list()[0]])
    fc1 = tf.add(tf.matmul(fc1, weights['wd1']), biases['bd1'])
    fc1 = tf.nn.relu(fc1)
    # Apply Dropout
    fc1 = tf.nn.dropout(fc1, dropout)
    # Output, class prediction
    out = tf.add(tf.matmul(fc1, weights['out']), biases['out'])
    return out
# https://github.com/aymericdamien/TensorFlow-Examples/blob/master/examples/3_NeuralNetworks/convolutional_network.py
# Define CNN layer properties
ksize= 5 # kernel size
layer1_kernels = 64 # Number of kernels in layer 1
layer2_kernels = 128 # Number of kernels in layer 2
layer_fc = 1024 # Number of Neurons in fully connected layer
# constants
meanv = 0.0
stddevv = 0.02
# Store layers weight & bias
weights = {
    # 5x5 conv, 1 input channel, 64 outputs
    'wc1': tf.Variable(tf.truncated_normal([ksize, ksize, 1, layer1_kernels], mean=meanv, stddev=stddevv, dtype=tf.float32)),
    # 5x5 conv, 64 inputs, 128 outputs
    'wc2': tf.Variable(tf.truncated_normal([ksize, ksize, layer1_kernels, layer2_kernels], mean=meanv, stddev=stddevv, dtype=tf.float32)),
    # fully connected, 8*8*128 inputs, 1024 outputs
    'wd1': tf.Variable(tf.truncated_normal([8*8*layer2_kernels, layer_fc], mean=meanv, stddev=stddevv, dtype=tf.float32)),
    # 1024 inputs, 43 outputs (class prediction)
    'out': tf.Variable(tf.truncated_normal([layer_fc, n_classes], mean=meanv, stddev=stddevv, dtype=tf.float32))
}
biases = {
    'bc1': tf.Variable(tf.zeros([layer1_kernels])),
    'bc2': tf.Variable(tf.zeros([layer2_kernels])),
    'bd1': tf.Variable(tf.zeros([layer_fc])),
    'out': tf.Variable(tf.zeros([n_classes]))
}
# Construct model
pred = conv_net(features, weights, biases, keep_prob)
#############################
# Linear Model
#weights = tf.Variable(tf.truncated_normal((n_input, n_classes)))
#biases = tf.Variable(tf.zeros(n_classes))
#logits = tf.matmul(features, weights) + biases
#logits = -np.amax(logits)
#pred = tf.nn.softmax(logits)
##############################
# Define loss and optimizer
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=labels)
#cross_entropy = tf.reduce_mean(-tf.reduce_sum(labels * tf.log(tf.clip_by_value(pred,1e-10,1.0)), reduction_indices=[1]))
loss = tf.reduce_mean(cross_entropy)
# Evaluate model
y_pred_cls = tf.argmax(pred,1)
correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(labels, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))
softmax_pred = tf.nn.softmax(pred)
softmax_pred_top5 = tf.nn.top_k(softmax_pred, k=5, sorted=True)
# TensorFlow GPU configuration setup
config = tf.ConfigProto()
config.gpu_options.allocator_type = 'BFC'
def model_training(epochs = 2, batch_size = 128, learning_rate = 0.001):
    # time
    start_time = time.time()
    optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(loss)
    # The accuracy measured against the validation set
    validation_accuracy = 0.0
    init = tf.global_variables_initializer()
    # Measurements used for graphing loss and accuracy
    log_batch_step = 50
    batches = []
    loss_batch = []
    train_acc_batch = []
    valid_acc_batch = []
    #######################################################
    with tf.Session(config = config) as session:
        session.run(init)
        batch_count = int(math.ceil(len(train_features)/batch_size))
        for epoch_i in range(epochs):
            print('Epoch {:>2}/{} '.format(epoch_i+1, epochs)+'#'*40)
            # Progress bar
            # batches_pbar = tqdm(range(batch_count), desc='Epoch {:>2}/{}'.format(epoch_i+1, epochs), unit='batches')
            batches_pbar = range(batch_count)
            # The training cycle
            for batch_i in batches_pbar:
                # Get a batch of training features and labels
                batch_index = np.random.choice(len(train_features), batch_size, replace=False)
                batch_features = train_features[batch_index]
                batch_labels = train_labels[batch_index]
                # Run optimizer and get loss
                _, lossi = session.run([optimizer, loss], feed_dict={features: batch_features, labels: batch_labels, keep_prob: 0.5})
                # Log every 50 batches
                if not batch_i % log_batch_step:
                    # Calculate training and validation accuracy
                    training_accuracy = session.run(accuracy, feed_dict={features: batch_features, labels: batch_labels, keep_prob: 1.})
                    validation_accuracy = session.run(accuracy, feed_dict=valid_feed_dict)
                    print('Batch # = {:>4} : Loss : {:19.16f}, Training accuracy = {:f} : Validation accuracy = {:f}'.format(batch_i, lossi, training_accuracy, validation_accuracy))
                    # Log batches
                    previous_batch = batches[-1] if batches else 0
                    batches.append(log_batch_step + previous_batch)
                    loss_batch.append(lossi)
                    train_acc_batch.append(training_accuracy)
                    valid_acc_batch.append(validation_accuracy)
        # Check accuracy against validation data
        # train_acc = session.run(accuracy, feed_dict=train_feed_dict)
        validation_accuracy = session.run(accuracy, feed_dict=valid_feed_dict)
    loss_plot = plt.subplot(211)
    loss_plot.set_title('Loss')
    loss_plot.plot(batches, loss_batch, 'g')
    loss_plot.set_xlim([batches[0], batches[-1]])
    acc_plot = plt.subplot(212)
    acc_plot.set_title('Accuracy')
    acc_plot.plot(batches, train_acc_batch, 'r', label='Training Accuracy')
    acc_plot.plot(batches, valid_acc_batch, 'b', label='Validation Accuracy')
    acc_plot.set_ylim([0, 1.0])
    acc_plot.set_xlim([batches[0], batches[-1]])
    acc_plot.legend(loc=4)
    plt.tight_layout()
    plt.show()
    print('Training accuracy = {:f} : Validation accuracy = {:f}'.format(training_accuracy, validation_accuracy))
    # end time
    end_time = time.time()
    time_delta = end_time - start_time
    print("Time usage : " + str(timedelta(seconds=int(round(time_delta)))))
    print('#'*80)
Now it is time to train the CNN model. The parameters (learning_rate, epochs, and batch_size) are tuned to pick the best combination.
# Train the system
# learning_rate parameter space
learning_rate_list = [0.5, 0.1, 0.01, 0.001, 0.0001]
print('>>>>>>>>> Model Training for learning_rate = ', learning_rate_list)
for learning_rate in learning_rate_list:
    print('>>>>>>>>>>>>>>>>>>>>> learning_rate = ', learning_rate)
    model_training(epochs = 1, batch_size = 128, learning_rate = learning_rate)
# Train the system
# batch_size parameter space
batch_size_list = [64, 128, 256]
print('>>>>>>>>> Model Training for batch_size = ', batch_size_list)
for batch_size in batch_size_list:
    print('>>>>>>>>>>>>>>>>>>>>> batch_size = ', batch_size)
    model_training(epochs = 1, batch_size = batch_size, learning_rate = 0.001)
# Train the system
# Epochs parameter space
epochs_list = [1, 2, 3, 4, 5]
print('>>>>>>>>> Model Training for epochs = ', epochs_list)
for epochs in epochs_list:
    print('>>>>>>>>>>>>>>>>>>>>> Epoch = ', epochs)
    model_training(epochs = epochs, batch_size = 128, learning_rate = 0.001)
Based on the above studies, the following parameters are selected to avoid overfitting: epochs = 4, batch_size = 128, learning_rate = 0.001.
The model is trained with these parameters and the validation accuracy is calculated.
model_training(epochs = 4, batch_size = 128, learning_rate = 0.001)
The final training accuracy is ~ 99% and the validation accuracy is ~98%.
def model_testing(epochs = 4, batch_size = 128, learning_rate = 0.001):
    # time
    start_time = time.time()
    optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(loss)
    # The accuracy measured against the test set
    test_accuracy = 0.0
    init = tf.global_variables_initializer()
    # Measurements used for graphing loss and accuracy
    log_batch_step = 50
    batches = []
    loss_batch = []
    train_acc_batch = []
    test_acc_batch = []
    #######################################################
    with tf.Session(config = config) as session:
        session.run(init)
        batch_count = int(math.ceil(len(train_features)/batch_size))
        for epoch_i in range(epochs):
            print('Epoch {:>2}/{} '.format(epoch_i+1, epochs)+'#'*40)
            # Progress bar
            batches_pbar = range(batch_count)
            # The training cycle
            for batch_i in batches_pbar:
                # Get a batch of training features and labels
                batch_index = np.random.choice(len(train_features), batch_size, replace=False)
                batch_features = train_features[batch_index]
                batch_labels = train_labels[batch_index]
                # Run optimizer and get loss
                _, lossi = session.run([optimizer, loss], feed_dict={features: batch_features, labels: batch_labels, keep_prob: 0.5})
                # Log every 50 batches
                if not batch_i % log_batch_step:
                    # Calculate training and test accuracy
                    training_accuracy = session.run(accuracy, feed_dict={features: batch_features, labels: batch_labels, keep_prob: 1.})
                    test_accuracy = session.run(accuracy, feed_dict=test_feed_dict)
                    print('Batch # = {:>4} : Loss : {:19.16f}, Training accuracy = {:f} : Test accuracy = {:f}'.format(batch_i, lossi, training_accuracy, test_accuracy))
                    # Log batches
                    previous_batch = batches[-1] if batches else 0
                    batches.append(log_batch_step + previous_batch)
                    loss_batch.append(lossi)
                    train_acc_batch.append(training_accuracy)
                    test_acc_batch.append(test_accuracy)
        # Check accuracy against test data
        test_accuracy = session.run(accuracy, feed_dict=test_feed_dict)
        y_pred = session.run(y_pred_cls, feed_dict=test_feed_dict)
        cm = tf.contrib.metrics.confusion_matrix(tf.argmax(test_labels, 1), y_pred, num_classes=n_classes, dtype=tf.int32).eval()
    loss_plot = plt.subplot(211)
    loss_plot.set_title('Loss')
    loss_plot.plot(batches, loss_batch, 'g')
    loss_plot.set_xlim([batches[0], batches[-1]])
    acc_plot = plt.subplot(212)
    acc_plot.set_title('Accuracy')
    acc_plot.plot(batches, train_acc_batch, 'r', label='Training Accuracy')
    acc_plot.plot(batches, test_acc_batch, 'b', label='Test Accuracy')
    acc_plot.set_ylim([0, 1.0])
    acc_plot.set_xlim([batches[0], batches[-1]])
    acc_plot.legend(loc=4)
    plt.tight_layout()
    plt.show()
    print('>>>>> Test accuracy = {:f}'.format(test_accuracy))
    print('Confusion Matrix : \n', cm)
    # end time
    end_time = time.time()
    time_delta = end_time - start_time
    print("Time usage : " + str(timedelta(seconds=int(round(time_delta)))))
    print('#'*80)
    return test_accuracy, y_pred, cm
# Testing the model
print('>>>>>>>>> Model Testing ')
test_accuracy, y_pred, cm = model_testing(epochs = 4, batch_size = 128, learning_rate = 0.001)
A test accuracy of ~94% is achieved.
Now let's plot some of the misclassified images.
# Plot example errors in testing
images = test_features
cls_true = test_labels
cls_true = np.argmax(cls_true, 1)
cls_pred = y_pred
rows = 4
cols = 4
plot_example_errors(images, cls_true, cls_pred, rows, cols)
A new dataset is generated by extracting traffic sign images from the Gram Dataset and Google Images. Nine images are saved in a directory and read. The following procedure is used to create a simple processing pipeline:
#%% Read new dataset
# Iterate through the list of images
imagelist = ["newts/"+imgname for imgname in os.listdir("newts/")]
newts_features = np.array([np.zeros((imgsize,imgsize))])
for image in imagelist:
    if image.endswith(".png"):
        # Read and preprocess the image
        image_original = mpimg.imread(image)
        img = (np.copy(image_original)*255).astype('uint8')
        croppedImage = cv2.resize(img, (imgsize, imgsize))
        gray = grayscale(croppedImage)
        nimg = normalize_image(gray)
        newts_features = np.concatenate((newts_features,[nimg]))
newts_features = np.delete(newts_features,0,0)
newts_labels = np.array([31, 20, 27, 40, 25, 0, 1, 14, 28])
newts_features = np.reshape(newts_features,(len(newts_features),imgsize*imgsize)).astype(np.float32)
newts_labels = encoder.transform(newts_labels).astype(np.float32)
print('Labels converted to one-hot encoded vector')
| Image | Correct Label | Comment | Prediction Estimate |
|---|---|---|---|
|   | 31 | Wild animals crossing. Similar images occur in the database. | Easy |
|   | 20 | Dangerous curve to the right. Similar images occur in the database. | Easy |
|   | 27 | Pedestrians, taken from a pixelated traffic sign. The person is moving right instead of left as in the database. Due to pixelation, this is a difficult image to classify. | Difficult |
|   | 40 | Roundabout mandatory. Similar images occur in the database. | Easy |
|   | 25 | Road work. This image has a yellow background, in contrast to the white one in the database. | Easy |
|   | 0 | A purposely misclassified image. No such image exists in the database. | Difficult |
|   | 1 | Speed limit (30km/h). Similar images occur in the database. | Easy |
|   | 14 | Stop. The shadow on the image may make it difficult to classify. | Difficult |
|   | 28 | Children crossing. The image in the database is slightly different, which may make classification difficult. | Difficult |
Here are the grayscale plots of the images with their true labels.
#%% plot images - new test dataset
rows = 3
cols = rows
rc = rows*cols
images = newts_features
cls_true = newts_labels
cls_true = np.argmax(cls_true, 1)
plot_images(images, cls_true, rows, cols)
# Feed dictionary
newts_feed_dict = {features: newts_features, labels: newts_labels, keep_prob: 1}
def model_testing_newdata(epochs = 4, batch_size = 128, learning_rate = 0.001):
start_time = time.time()
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(loss)
init = tf.global_variables_initializer()
# Measurements use for graphing loss and accuracy
log_batch_step = 50
batches = []
loss_batch = []
train_acc_batch = []
test_acc_batch = []
#######################################################
with tf.Session(config = config) as session:
session.run(init)
batch_count = int(math.ceil(len(train_features)/batch_size))
for epoch_i in range(epochs):
print('Epoch {:>2}/{} '.format(epoch_i+1, epochs)+'#'*40)
# Progress bar
batches_pbar = range(batch_count)
# The training cycle
for batch_i in batches_pbar:
# Get a batch of training features and labels
batch_index = np.random.choice(len(train_features),batch_size, replace=False)
batch_features = train_features[batch_index]
batch_labels = train_labels[batch_index]
# Run optimizer and get loss
_, lossi = session.run( [optimizer, loss], feed_dict={features: batch_features, labels: batch_labels, keep_prob: 0.5})
# Log every 50 batches
if not batch_i % log_batch_step:
# Calculate Training and Validation accuracy
training_accuracy = session.run(accuracy, feed_dict={features: batch_features, labels: batch_labels, keep_prob: 1.})
test_accuracy = session.run(accuracy, feed_dict=newts_feed_dict)
print('Batch # = {:>4} : Loss : {:19.16f}, Training accuracy = {:f} : Test accuracy = {:f}'.format(batch_i, lossi, training_accuracy, test_accuracy))
# Log batches
previous_batch = batches[-1] if batches else 0
batches.append(log_batch_step + previous_batch)
loss_batch.append(lossi)
train_acc_batch.append(training_accuracy)
test_acc_batch.append(test_accuracy)
# Evaluate accuracy and predictions on the new test data
newts_accurary = session.run(accuracy, feed_dict=newts_feed_dict)
y_pred_newts = session.run(y_pred_cls, feed_dict=newts_feed_dict)
softmax_newts = session.run(softmax_pred, feed_dict=newts_feed_dict)
softmax_pred_top5_newts = session.run(softmax_pred_top5, feed_dict=newts_feed_dict)
loss_plot = plt.subplot(211)
loss_plot.set_title('Loss')
loss_plot.plot(batches, loss_batch, 'g')
loss_plot.set_xlim([batches[0], batches[-1]])
acc_plot = plt.subplot(212)
acc_plot.set_title('Accuracy')
acc_plot.plot(batches, train_acc_batch, 'r', label='Training Accuracy')
acc_plot.plot(batches, test_acc_batch, 'b', label='Test Accuracy')
acc_plot.set_ylim([0, 1.0])
acc_plot.set_xlim([batches[0], batches[-1]])
acc_plot.legend(loc=4)
plt.tight_layout()
plt.show()
print('Training accuracy = {:f} : Test accuracy = {:f}'.format(training_accuracy, test_accuracy))
# end time
end_time = time.time()
time_delta = end_time - start_time
print("Time usage : "+ str(timedelta(seconds=int(round(time_delta)))))
return newts_accurary, y_pred_newts, softmax_newts, softmax_pred_top5_newts
# Testing the model on new dataset
print('>>>>>>>>> Model Testing on New Dataset')
newts_accurary, y_pred_newts, softmax_newts, softmax_pred_top5_newts = model_testing_newdata(epochs = 3, batch_size = 128, learning_rate = 0.001)
The validation accuracy is ~98%, while the accuracy on the new test images is only ~44%; the deep learning model does not generalize well to the new dataset.
Let's plot the prediction.
#%% plot images - new test dataset
rows = 3
cols = rows
rc = rows*cols
images = newts_features
cls_true = newts_labels
cls_true = np.argmax(cls_true, 1)
cls_pred = y_pred_newts
plot_images(images, cls_true, rows, cols, cls_pred)
# Print probabilities
for nimg in range(len(newts_labels)):
print('Image # ',nimg)
img = newts_features[nimg].reshape((imgsize, imgsize))
plt.imshow(img, cmap='gray')
plt.axis('off')
plt.show()
print('True class : ',np.argmax(newts_labels[nimg]))
print('p_class\tsoftmax_p')
for nprob in range(len(softmax_pred_top5_newts[0][0])):
print('{:>7}\t{:>9.5f}'.format(softmax_pred_top5_newts[1][nimg][nprob], softmax_pred_top5_newts[0][nimg][nprob]))
print('#'*80)
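The printed top-5 lists can also be summarized as top-1 and top-5 accuracy. A minimal NumPy sketch, using three of the results above (true classes 31, 27, and 14) as the sample:

```python
import numpy as np

# Top-5 predicted class indices for three of the new images (rows),
# taken from the results printed above
top5_indices = np.array([[31, 23, 21, 29, 19],
                         [12, 23, 38, 11, 21],
                         [14, 38,  4,  0, 34]])
true_classes = np.array([31, 27, 14])

# Fraction of images whose true class is the top prediction / in the top 5
top1_acc = np.mean(top5_indices[:, 0] == true_classes)
top5_acc = np.mean([t in row for t, row in zip(true_classes, top5_indices)])
```

For this three-image sample both scores are 2/3; over the full set of new images the same computation gives the ~44% test accuracy reported above.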
| Image | Correct Label | Predicted Labels (top 5) | Discussion |
|---|---|---|---|
| (image) | 31 | [31, 23, 21, 29, 19] | The classifier correctly classified the image as class 31 with ~53.3% probability. |
| (image) | 20 | [20, 28, 27, 11, 22] | The image is correctly classified as class 20 with ~51.1% confidence. |
| (image) | 27 | [12, 23, 38, 11, 21] | As estimated earlier, this is a difficult image to classify. The classifier does not include the correct class in its top 5 predictions. |
| (image) | 40 | [29, 24, 22, 28, 23] | This was expected to be an easy image to classify. However, it is incorrectly classified as class 29, and the correct class does not appear in the top 5 predictions. |
| (image) | 25 | [25, 30, 11, 22, 28] | The classifier correctly classifies the image with ~41% confidence. |
| (image) | 0 | [16, 40, 5, 3, 10] | As estimated earlier, the image is not correctly classified; it is misclassified as class 16. |
| (image) | 1 | [7, 5, 8, 2, 10] | This prediction is surprising. It was expected to be an easy classification, yet the classifier incorrectly predicts class 7 and does not include the correct class in its top 5 predictions. |
| (image) | 14 | [14, 38, 4, 0, 34] | This was expected to be a difficult image to classify, yet the classifier predicts the correct class with ~92.2% confidence. |
| (image) | 28 | [23, 29, 28, 22, 19] | As expected, this image is difficult to classify, and the prediction is wrong. The correct label 28 does appear in the top 5 predictions, but with only ~0.01% confidence. |
It is interesting to note that the model does not classify the new images on par with the testing dataset accuracy.
Answer:
The answer is provided in the section =>
Step 2: Preprocess and Generate New Data
Preprocess images (grayscaling and normalization)
Describe how you set up the training, validation and testing data for your model. If you generated additional data, why?
Answer:
The answer is provided in the section =>
Step 2: Preprocess and Generate New Data
Generate Additional Dataset
What does your final architecture look like? (Type of model, layers, sizes, connectivity, etc.)
Answer:
The answer is provided in the section =>
Step 3: Design and Test a CNN Model Architecture
How did you train your model? (Type of optimizer, batch size, epochs, hyperparameters, etc.)
Answer:
The answer is provided in the section =>
Step 3: Design and Test a CNN Model Architecture
CNN Training
What approach did you take in coming up with a solution to this problem?
Answer:
The CNN model is trained with different hyperparameters such as learning_rate, epochs, and batch_size. A parametric study is conducted for each of these parameters, and the best values are chosen to avoid overfitting or underfitting.
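The parametric study can be organized as a simple grid search. A minimal sketch, where the grid values are assumptions for illustration and `model_training` stands in for the notebook's training function:

```python
from itertools import product

# Hypothetical hyperparameter grid; the exact values tried are assumptions
learning_rates = [0.01, 0.001, 0.0001]
epochs_list = [3, 5, 10]
batch_sizes = [64, 128, 256]

# Enumerate every combination; each one would be passed to the training
# function, e.g. model_training(epochs=e, batch_size=b, learning_rate=lr),
# and the resulting validation accuracy recorded to pick the best setting.
grid = list(product(learning_rates, epochs_list, batch_sizes))
print('Combinations to evaluate:', len(grid))
```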
Take several pictures of traffic signs that you find on the web or around you (at least five), and run them through your classifier on your computer to produce example results. The classifier might not recognize some local signs but it could prove interesting nonetheless.
Use the code cell (or multiple code cells, if necessary) to implement the first step of your project. Once you have completed your implementation and are satisfied with the results, be sure to thoroughly answer the questions that follow.
The implementation is described in
Step 4: Testing CNN Model on New Images
Choose five candidate images of traffic signs and provide them in the report. Are there any particular qualities of the image(s) that might make classification difficult? It would be helpful to plot the images in the notebook.
Answer:
The answer is described in
Step 4: Testing CNN Model on New Images
Observations
Is your model able to perform equally well on captured pictures or a live camera stream when compared to testing on the dataset?
Answer:
On live-stream or captured images, the model will not perform on par with the testing dataset because the training/testing dataset is small. The model needs to be trained on a larger dataset covering different lighting conditions.
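One inexpensive way to broaden the lighting conditions in the training set is brightness jitter. A minimal NumPy sketch (the scale factors below are assumptions for illustration, not values used in this notebook):

```python
import numpy as np

def jitter_brightness(img, scale):
    """Scale pixel intensities of a [0, 1] float image and clip the result."""
    return np.clip(img * scale, 0.0, 1.0)

img = np.full((32, 32, 1), 0.5, dtype=np.float32)  # uniform mid-gray test image
brighter = jitter_brightness(img, 1.4)   # mid-gray becomes 0.7
saturated = jitter_brightness(img, 3.0)  # clipped at 1.0
```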
Use the model's softmax probabilities to visualize the certainty of its predictions, tf.nn.top_k could prove helpful here. Which predictions is the model certain of? Uncertain? If the model was incorrect in its initial prediction, does the correct prediction appear in the top k? (k should be 5 at most)
Answer:
The implementation is described in
Step 4: Testing CNN Model on New Images
Calculating SoftMax Probabilities
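For reference, the behaviour of tf.nn.top_k on the softmax output can be mimicked with a small NumPy stand-in (an illustration of what the op returns, not the TensorFlow call used in the notebook):

```python
import numpy as np

def softmax(logits):
    # Subtract the row max for numerical stability before exponentiating
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def top_k(probs, k=5):
    # Mirrors tf.nn.top_k: returns (values, indices) sorted descending
    idx = np.argsort(probs, axis=-1)[..., ::-1][..., :k]
    vals = np.take_along_axis(probs, idx, axis=-1)
    return vals, idx

logits = np.array([[2.0, 1.0, 0.5, 3.0, -1.0]])
vals, idx = top_k(softmax(logits), k=3)
```

Here the largest logit is at index 3, so class 3 leads the top-k list with the highest softmax probability.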
If necessary, provide documentation for how an interface was built for your model to load and classify newly-acquired images.
Answer:
The pipeline is described in
Step 4: Testing CNN Model on New Images